System and method for array-based data storage and search

ABSTRACT

Provided are computer devices and methods for effectively generating and updating a sorted array for quick data access. The array allocates more space than required by the elements it stores. In other words, the array leaves empty spaces between elements such that insertion of a new element only requires the shifting of a small number, or even none, of the existing elements in the array.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation application of U.S. patent application Ser. No. 14/107,863, filed Dec. 16, 2013, the content of which is hereby incorporated by reference into the present disclosure.

BACKGROUND

The array is a common, useful, and important data structure for computer systems to store and process data. An array is a set of consecutive memory locations grouped under one name, where each individual location is accessed by its index or subscript. It is typically used in computer science to implement static lookup tables that hold multiple values of the same data type. Sorting an array is useful for organizing data in ordered form and retrieving it rapidly.

Elements within a sorted array are found using a binary search, in O(log n); thus sorted arrays are suited for cases when one needs to be able to look up elements quickly. The complexity for lookups is the same as for self-balancing binary search trees.

Inserting and deleting elements in a sorted array, however, is known to be costly. The insertion and deletion of elements in a sorted array executes in O(n). This is due to the need to shift all the elements following the element to be inserted or deleted. In comparison, a self-balancing binary search tree inserts and deletes in O(log n).

For instance, when establishing a new sorted array with n elements, the cost is O(n²) for inserting all n elements, one by one, into the array. When n is large, the cost is prohibitively high and renders a regular array impractical for storing sorted data elements.

Arrays, however, are simple and have good locality of reference. In modern computer systems, arrays may take advantage of the cache memory to achieve good performance.

SUMMARY

The present disclosure provides computer systems and methods for creating and using a sorted array that is quick to search and also quick to form and update. The new data structure implemented in the present computer systems and methods is referred to as “Sorted Elastic Array” (SEA). SEA allocates more array cells than actually stored elements, but it does not contain all the pointers found in binary search trees or multi-way trees, so it is still memory efficient. More importantly, it improves the insertion operation from O(n) to O(log n).

In one embodiment, therefore, provided is a computer system for inserting an input data element into an array, comprising a processor, memory and program instructions which, when executed by the processor, configure the system to: (a) load an array into memory, wherein the array has a size of B for storing, at a maximum, B data elements, wherein the array contains T data elements, and wherein the data elements are sorted in the array such that any position in the array, if not empty, contains a data element that is greater than all data elements stored in positions on the left and smaller than all data elements stored in positions on the right; (b) identify, in the array, a first data element as the greatest among all data elements smaller than the input data element and a second data element as the smallest among all data elements greater than the input data element, wherein the first and second data elements are adjacent to each other or separated by one or more empty positions therebetween; and (c) (i) if the first and second data elements are located in adjacent positions, shift the first data element and all adjacent data elements on the left one position to the left and then place the input data element in the position vacated by the first data element, or shift the second data element and all adjacent data elements on the right one position to the right and then place the input data element in the position vacated by the second data element, or (ii) place the input data element in a position between the first and second data elements.

In some aspects, B is at least 20% greater than T. In some aspects, the first and second data elements are separated by at least three empty positions and the input data element is placed in the middle position among the empty positions.

In some aspects, the program instructions further configure the system to, when a ratio f=T/B is greater than a predetermined threshold, enlarge the size of the array to L by allocating memory for L-B additional positions to the array. In some aspects, the predetermined threshold is between 0.5 and 0.9.

In some aspects, the program instructions further configure the system to shift one or more data elements in the array so that empty positions in the array are more evenly distributed.

In some aspects, the program instructions further configure the system to repeat steps (b) and (c) for one or more new input data elements. In some aspects, for different input data elements, both steps (i) and (ii) are carried out.

In some aspects, the program instructions further configure the system to insert into a second index array, at position p, a value i, wherein p is a hash function output of the input data element and i is the position of the input data element in the array. In some aspects, the array and the second index array are of the same length.

In some aspects, the hash function generates a unique non-negative integer value for each data element in the array.

Also provided, in one embodiment, is a computer system for deleting a query data element from an array, comprising a processor, memory and program instructions which, when executed by the processor, configure the system to: (a) access an array that contains T data elements, wherein the array has a size of B for storing, at a maximum, B data elements, and wherein the data elements are sorted in the array such that any position in the array, if not empty, contains a data element that is greater than all data elements stored in positions on the left and smaller than all data elements stored in positions on the right; (b) identify a position in the array containing a data element equal to the query data element; and (c) mark the position as empty.

In some aspects, the program instructions further configure the system to, when a ratio f=T/B is smaller than a predetermined threshold, shift data elements in the array so that one end of the array has one or more empty positions, and delete the empty positions from the array, thereby shrinking the size of the array. In some aspects, the predetermined threshold is between 0.1 and 0.7. In some aspects, the number of empty positions at the end is at least 10% of B.

BRIEF DESCRIPTION OF THE DRAWINGS

Provided as embodiments of this disclosure are drawings which illustrate by exemplification only, and not limitation, wherein:

FIG. 1 is a diagram illustrating an array of memory cells containing four data elements and five empty array cells.

FIG. 2 is a diagram illustrating an array of memory cells containing four references to external data elements and five empty array cells.

FIG. 3 is a diagram illustrating resizing (growing) an elastic array from length B to length L.

FIG. 4 is a diagram illustrating the dimension and location of data elements after remapping the elements from an old array to a new array; and

FIG. 5 is a diagram illustrating the data elements in an elastic array and corresponding index hash elements.

It will be recognized that some or all of the figures are schematic representations for exemplification and, hence, that they do not necessarily depict the actual relative sizes or locations of the elements shown. The figures are presented for the purpose of illustrating one or more embodiments with the explicit understanding that they will not be used to limit the scope or the meaning of the claims that follow below.

DETAILED DESCRIPTION

This disclosure provides computer systems and methods that employ a sorted array for quick data access and storage. As noted above, the conventional sorted array is costly to update, as each insertion of a new element requires shifting of roughly half of the existing elements in the array, depending on where the new element should be inserted. The present technology, however, in one embodiment, provides a sorted array that allocates more space (i.e., positions or cells in the array) than required by the elements it stores. In other words, the array leaves empty spaces between elements such that insertion of a new element necessitates the shifting of a small number, or even none, of the existing elements. Given the ability of such an array to accommodate new elements without the need to increase its size at every insertion, the array is also referred to as a “sorted elastic array (SEA).”

Even with the increased need for memory space, due to the larger size of the array to include empty positions, a SEA still requires much less memory than a binary tree for holding the same number of elements. For instance, suppose a SEA contains T elements but includes B positions (or “spaces,” “buckets,” “cells” or “slots”) for holding elements, and suppose B is about twice as large as T (i.e., B=2T); then the SEA merely needs enough memory space for holding 2T elements. To hold the same number (T) of elements, a binary tree would require T nodes plus at least 2×T pointers or references to connect the nodes as a tree; hence the total memory is at least 3T, not counting that pointers can likely take more memory space than a simple data element.

Nevertheless, with the smaller memory space required, search and update of a SEA are as cost-effective as those of a binary tree, as described in further detail below. Therefore, the present technology is superior to binary tree-based technology in terms of memory space requirement and operation efficiency.

The Load Factor

A SEA of the present disclosure, for instance, includes B positions indexed 0, 1, . . . , B−1. At a specific moment, the SEA contains a total of T stored elements, where T<B.

Assume the T elements are randomly distributed among the B memory cells. The probability that a memory cell would contain t data elements is given by the binomial distribution:

${P(t)} = {\frac{T!}{{t!}{\left( {T - t} \right)!}}\left( \frac{1}{B} \right)^{t}\left( {1 - \frac{1}{B}} \right)^{T - t}}$

When T and B are both large, i.e., T>>1, B>>1, the binomial probability can be approximated by the Poisson distribution:

$P(t) \approx \frac{(T/B)^{t} e^{-T/B}}{t!} = \frac{f^{t} e^{-f}}{t!}$

where f=T/B is denoted the “load factor,” and e is the base of the natural logarithm. The statistical mean of t, denoted by E(t), of the Poisson distribution is:

E(t)=f

and the standard deviation is:

$\sigma(t) = \sqrt{f}$.

From the above formulas, it can be seen that when the load factor f decreases, the average number of data elements contained in a position decreases too. When the load factor f is 0.5, then on average, every two memory cells would contain just one element and the standard deviation is only 0.71, which is a narrow range.
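The relationship between the binomial and Poisson expressions above can be checked numerically. The following is a minimal Python sketch; the function names `binomial_pmf` and `poisson_pmf` and the sample values T=5000 and B=10000 are illustrative assumptions, not part of the disclosure.

```python
import math

def binomial_pmf(t, T, B):
    """Probability that one cell receives exactly t of T elements spread over B cells."""
    return math.comb(T, t) * (1 / B) ** t * (1 - 1 / B) ** (T - t)

def poisson_pmf(t, f):
    """Poisson approximation of the same probability, with load factor f = T / B."""
    return f ** t * math.exp(-f) / math.factorial(t)

# Illustrative values only: a half-full array (f = 0.5).
T, B = 5000, 10000
f = T / B
for t in range(4):
    print(t, binomial_pmf(t, T, B), poisson_pmf(t, f))

# Mean and standard deviation of the Poisson model.
print("mean:", f, "std dev:", math.sqrt(f))
```

For large T and B the two columns printed by the loop agree closely, which is the sense in which the load factor alone characterizes the expected occupancy.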

During an insertion or deletion operation, if the load factor moves outside a desired range (becoming too large or too small), the array can be enlarged or shrunk to maintain an optimal load factor for the array.

Creation of a Sorted Elastic Array and Insertion of Elements

Take an array that contains elements from left to right in ascending order as an example. It is understood, however, that the directions “left” and “right” impose no limitation, as they are used as relative terms for convenience of illustration.

At step 1, a computer system of the present disclosure allocates memory for an initial array of size B, B being a non-zero integer. The system marks all positions in the array as empty.

At step 2, a first element is inserted into any position, preferably the center position, in the initial array.

At step 3, a new element is inserted. First, a binary search is performed in the current array to find the appropriate position, p, at which the new element is to be inserted so that the order of all the elements in the array is preserved. In other words, p is to the right of the element that is immediately smaller than the new element (i.e., greatest among all that are smaller than the new element) and to the left of the element that is immediately greater than the new element (i.e., smallest among all that are greater than the new element).

During the binary search, the first, last, and midpoint pointers may occasionally point to empty cells. In such cases, they are moved to the next non-empty positions. Once the correct position is found, the new element is inserted as follows.
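The search described above can be sketched in Python as follows. This is only one possible reading of the text: the name `find_slot`, the use of `None` as the empty marker, and the choice to advance an empty midpoint to the right are assumptions made for illustration.

```python
EMPTY = None  # assumed marker for an empty cell

def find_slot(cells, key):
    """Return (left, right) for key in a sorted array that may contain gaps.

    left  is the index of the greatest stored element smaller than key (or -1);
    right is the index of the smallest stored element not smaller than key
          (or len(cells)). Every cell strictly between them is empty.
    """
    lo, hi = 0, len(cells) - 1
    left, right = -1, len(cells)
    while lo <= hi:
        mid = (lo + hi) // 2
        probe = mid
        # The midpoint may be empty: advance to the next non-empty cell.
        while probe <= hi and cells[probe] is EMPTY:
            probe += 1
        if probe > hi:
            # Nothing is stored in [mid, hi]; continue in the left half.
            hi = mid - 1
        elif cells[probe] < key:
            left = probe
            lo = probe + 1
        else:
            right = probe
            hi = probe - 1
    return left, right

cells = [EMPTY, 2, EMPTY, EMPTY, 5, 7, EMPTY, 11, EMPTY]
print(find_slot(cells, 6))  # neighbours 5 and 7 sit in adjacent cells
print(find_slot(cells, 3))  # neighbours 2 and 5 are separated by two empty cells
```

Skipping over empty cells costs extra steps when gaps are long, so the logarithmic behavior of this sketch relies on the gaps staying short, as they do when the load factor is kept within range.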

If the cell at position p is empty and lies between the positions occupied by the two immediately adjacent elements, the new element is inserted directly into the position pointed to by p.

If the cell at position p is not empty, or in other words, the two immediately adjacent elements occupy positions right next to each other and leave no empty position in between, then the array shifts one of the two elements further down its respective side. Sometimes, either or both of the two elements have no immediately downstream (further right or left) empty position, such that shifting either of them will require shifting all of the immediately downstream elements. In that respect, therefore, the element whose shifting requires the smaller number of shifts of immediately downstream elements is preferred.

Sometimes, the new element may be smaller or greater than all existing elements, so that it will be inserted at either end (leftmost or rightmost) of the array in any of the empty positions there. Still, the new element is preferably inserted at a position that is at the center of the consecutive empty positions at the end. In some aspects, the new element can be inserted with a gap that has a width equal to 1/f (the inverse of the load factor).
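The placement rules above can be sketched as a single Python function. It assumes the neighbor indices `left` and `right` have already been found by a search (with -1 and `len(cells)` standing for "no neighbor on that side"); the name `insert_at`, the `None` empty marker, and the tie-breaking toward the left are illustrative assumptions.

```python
EMPTY = None  # assumed marker for an empty cell

def insert_at(cells, key, left, right):
    """Place key between the neighbours at indices left and right.

    Returns the index where key was stored. Raises OverflowError when no
    empty cell exists on either side, in which case the array must grow.
    """
    n = len(cells)
    if right - left > 1:
        # A gap exists: use the middle of the run of empty cells.
        pos = (left + right) // 2
        cells[pos] = key
        return pos
    # No gap: look for the nearest empty cell on each side.
    l = left
    while l >= 0 and cells[l] is not EMPTY:
        l -= 1
    r = right
    while r < n and cells[r] is not EMPTY:
        r += 1
    shift_left = left - l if l >= 0 else None    # elements to move leftwards
    shift_right = r - right if r < n else None   # elements to move rightwards
    if shift_left is None and shift_right is None:
        raise OverflowError("no empty cell available; grow the array first")
    if shift_right is None or (shift_left is not None and shift_left <= shift_right):
        # Move cells[l+1 .. left] one step left, vacating index `left`.
        cells[l:left] = cells[l + 1:left + 1]
        cells[left] = key
        return left
    # Move cells[right .. r-1] one step right, vacating index `right`.
    cells[right + 1:r + 1] = cells[right:r]
    cells[right] = key
    return right

cells = [EMPTY, 2, EMPTY, EMPTY, 5, 7, EMPTY, 11, EMPTY]
insert_at(cells, 6, 4, 5)   # 5 and 7 are adjacent, so one element is shifted
insert_at(cells, 3, 1, 3)   # a gap exists between 2 and 5, so no shift is needed
print(cells)
```

The sketch always prefers the side that needs fewer shifts, matching the preference stated above; the optional 1/f gap at the array ends is not modeled.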

During shifting of elements, if the number of elements to be shifted exceeds a predetermined threshold and there are enough empty cells next to these elements, then the array disperses the elements among the empty cells so that there may be gaps (empty positions) between the elements. In one aspect, the predetermined threshold is 3, 4, 5, 6, 7, 8, 9, or 10 or is about 1%, 2%, 3%, 4%, 5%, 10%, or 15% of the size (B) of the array.

When the number of elements in the current array exceeds a predetermined threshold (determined by the load factor f), the current array can be augmented or replaced by a new array of greater size by allocating more memory. The size of the new array is denoted by L (see FIG. 3) and the augment factor is denoted by g, so L=gB, where B is the old size. Unused cells in the new array are marked as empty. In some aspects, the predetermined threshold of f is at least 0.5, or at least 0.55, 0.65, 0.7, 0.75, 0.8, 0.85 or 0.9. In some aspects, the predetermined threshold of f is not greater than 0.95, 0.9, 0.85, or 0.8. In some aspects, g is at least 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9 or 2. In some aspects, g is not greater than 2.5, 2.4, 2.3, 2.2, 2.1, 2, 1.9, 1.8, 1.7, 1.6 or 1.5.

Once the array is enlarged, it can be remapped so that the elements in the array are more evenly distributed in the new array (see FIG. 4). In some aspects, the remapping procedure starts from the rightmost element in the array to avoid overwrites and data loss.
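A minimal Python sketch of growing and remapping is given below. It takes the "replaced by a new array" route and copies the elements into a freshly allocated list, which sidesteps the overwrite concern rather than implementing the right-to-left in-place remapping. The names `load_factor` and `grow_and_remap`, the default g=2, and the even-band placement rule are assumptions for illustration.

```python
EMPTY = None  # assumed marker for an empty cell

def load_factor(cells):
    """f = T / B: stored elements divided by allocated cells."""
    return sum(1 for x in cells if x is not EMPTY) / len(cells)

def grow_and_remap(cells, g=2.0):
    """Return a new array of size L = g * B with the elements spread evenly."""
    stored = [x for x in cells if x is not EMPTY]
    t = len(stored)
    new_size = max(int(len(cells) * g), len(cells) + 1)
    new_cells = [EMPTY] * new_size
    for k, x in enumerate(stored):
        # Put element k at the centre of the k-th of t equal-width bands.
        new_cells[(2 * k + 1) * new_size // (2 * t)] = x
    return new_cells

cells = [1, 2, EMPTY, 3, 4, 5]
if load_factor(cells) > 0.75:      # 0.75 is an example threshold
    cells = grow_and_remap(cells, g=2)
print(cells)
```

Because the new size exceeds the number of stored elements, consecutive elements always land in distinct cells and keep their order.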

Deletion of an Element

Deletion of an element can be implemented by identifying the position of the element and then marking the position as empty. No additional operation, such as shifting, is needed, as the array of the present disclosure allows empty positions.

In the event the new load factor becomes smaller than a threshold, the elements can be remapped (shifted) to a narrower range, to leave a number of consecutive empty positions at either end of the array. After these empty positions are deleted, the new, shrunken array will have an increased load factor. In some aspects, the predetermined threshold of f is at least 0.1, or at least 0.2, 0.3, 0.4, 0.5, 0.55, 0.65, 0.7, 0.75, 0.8, 0.85 or 0.9. In some aspects, the predetermined threshold of f is not greater than 0.95, 0.9, 0.85, 0.8, 0.75, or 0.7. The size of the new array is denoted by S and the augment factor is denoted by g, so B=gS, where B is the old size. In some aspects, g is at least 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9 or 2. In some aspects, g is not greater than 2.5, 2.4, 2.3, 2.2, 2.1, 2, 1.9, 1.8, 1.7, 1.6 or 1.5.
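Deletion and shrinking can be sketched in Python as follows. For simplicity the sketch builds the shorter array directly instead of remapping in place and then truncating; the names `delete_at` and `shrink_if_sparse`, the example threshold of 0.25, and the default g=2 are assumptions for illustration.

```python
EMPTY = None  # assumed marker for an empty cell

def delete_at(cells, pos):
    """Delete the element at pos by marking its cell empty; nothing is shifted."""
    cells[pos] = EMPTY

def shrink_if_sparse(cells, low=0.25, g=2.0):
    """Return a smaller array of size S = B / g when the load factor is below low."""
    stored = [x for x in cells if x is not EMPTY]
    t = len(stored)
    if t / len(cells) >= low:
        return cells  # still dense enough; keep the array as it is
    new_size = max(int(len(cells) / g), t, 1)
    new_cells = [EMPTY] * new_size
    for k, x in enumerate(stored):
        # Spread the survivors evenly over the shorter array.
        new_cells[(2 * k + 1) * new_size // (2 * t)] = x
    return new_cells

cells = [EMPTY, EMPTY, 3, EMPTY, 5, EMPTY, EMPTY, 9] + [EMPTY] * 8
delete_at(cells, 4)               # remove the element 5
cells = shrink_if_sparse(cells)   # load factor is now 2/16, below 0.25
print(cells)
```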

The Worst Case Scenario

When creating a new SEA, a worst-case scenario is realized when the input elements are pre-sorted. In such a case, new elements are always added to the right end or left end of the array, and eventually the shift operations will dominate and the SEA will behave like a regular sorted array. In such a scenario, the cost of insertion is O(n).

To avoid the worst-case scenario, one can maintain an orderliness measurement, namely Ω, to measure the degree of orderliness of the incoming elements. To quantify the measure, let Ω=1 when input elements are in total ascending order; Ω=−1 when input elements are in total descending order; and Ω=0 when input elements enter the system in random order. One embodiment of the orderliness Ω is:

$\Omega = (S_{i} - S_{d})/N$

where $S_{i}$ is the cumulative count of input elements that are greater than their preceding input element, and $S_{d}$ is the cumulative count of input elements that are less than their preceding input element. Ω takes continuous values in the range −1 ≤ Ω ≤ 1.
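The orderliness measure can be maintained incrementally as elements arrive. The Python sketch below assumes that N is the number of comparisons made so far (one per input element after the first), an interpretation that makes Ω reach exactly 1 or −1 for fully sorted input; the class name `Orderliness` and its members are hypothetical.

```python
class Orderliness:
    """Running estimate of the orderliness of an input stream."""

    def __init__(self):
        self.increases = 0   # S_i: inputs greater than their predecessor
        self.decreases = 0   # S_d: inputs less than their predecessor
        self.comparisons = 0 # N (assumed: number of comparisons so far)
        self.previous = None

    def observe(self, x):
        """Record one incoming element."""
        if self.previous is not None:
            self.comparisons += 1
            if x > self.previous:
                self.increases += 1
            elif x < self.previous:
                self.decreases += 1
        self.previous = x

    @property
    def value(self):
        """Current orderliness, between -1 and 1; 0.0 before any comparison."""
        if self.comparisons == 0:
            return 0.0
        return (self.increases - self.decreases) / self.comparisons

omega = Orderliness()
for x in [1, 2, 3, 4, 5]:
    omega.observe(x)
print(omega.value)  # fully ascending input
```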

With this orderliness parameter, one can modify the insertion method used in the random data input case. The goal is that, for ascending elements, the array inserts and remaps existing elements toward the left end of the array and appends new elements toward the right end; for descending elements, it can insert and remap existing elements toward the right end of the array so that new elements are inserted toward the left end.

Two control parameters may be used for the modified remapping: one is the span of the newly mapped elements, denoted by W; the other is the position of the center of mass of all the newly mapped elements, denoted by C.

One embodiment is using a linear model:

Let W=a|Ω|+b and L be the size of the new augmented array. Notation |Ω| means the absolute value of Ω. In the case when input elements are in total random order, i.e., Ω=0, W should be equal to L, hence

L=a|0|+b;

so b is solved as b=L.

At Ω=1, i.e., the input elements are in total ascending order, W may be equal to two times the number of elements in the array (one gap exists between adjacent elements):

2fB=a+b

where B is the size of the old array: L=gB. One can solve for a: a=2fB−b=(2fL/g)−L=(2f/g−1)L.

Now one can obtain the full expression for W:

W=a|Ω|+b=(2f/g−1)L|Ω|+L=[(2f/g−1)|Ω|+1]L

For parameter C, a linear model can be used too:

Assume C=aΩ+b

At Ω=0, let C equal L/2, so b=L/2. At Ω=1, let C equal the total number of elements in the array:

C=T=fB=(fL)/g

So a=C−L/2=(2f/g−1)L/2

Now the full expression for C is:

C=[(2f/g−1)Ω+1]L/2

A derived parameter from W and C is the starting position for the newlymapped elements:

$\begin{matrix}{S = {C - {W/2}}} \\{= {\left\lbrack {\left( {1 - {2{f/g}}} \right)\left( {{\Omega } - \Omega} \right)} \right\rbrack {L/2}}}\end{matrix}$

The ending position of the mapped elements is:

$\begin{matrix}{E = {C + {W/2}}} \\{= {L - {\left\lbrack {\left( {1 - {2{f/g}}} \right)\left( {{\Omega } + \Omega} \right)} \right\rbrack {L/2}}}}\end{matrix}$

So for any value of Ω, one can spread the elements uniformly over the range [S, E] in the new resized array.
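The linear model can be evaluated directly. The Python sketch below computes S and E from Ω, f, g and L using the expressions for W and C, then spreads elements evenly inside that range; the names `remap_range` and `spread_positions`, the sample values f=0.75, g=2, L=16, and the band-center placement rule are assumptions for illustration.

```python
def remap_range(omega, f, g, L):
    """Return (S, E): start and end of the span used for remapped elements."""
    k = 2 * f / g - 1
    W = (k * abs(omega) + 1) * L   # span of the remapped elements
    C = (k * omega + 1) * L / 2    # centre of mass of the remapped elements
    return C - W / 2, C + W / 2

def spread_positions(t, start, end):
    """Evenly spaced integer cells for t elements inside [start, end)."""
    return [int(start + (2 * k + 1) * (end - start) / (2 * t)) for k in range(t)]

f, g, L = 0.75, 2, 16              # old array: B = L / g = 8 cells, T = f * B = 6
print(remap_range(0, f, g, L))     # random input: the whole new array, (0.0, 16.0)
print(remap_range(1, f, g, L))     # ascending input: left part only, (0.0, 12.0)
print(remap_range(-1, f, g, L))    # descending input: right part only, (4.0, 16.0)

start, end = remap_range(1, f, g, L)
print(spread_positions(6, start, end))  # [1, 3, 5, 7, 9, 11]
```

With Ω=1 the six elements occupy every other cell of the first 2T=12 cells, leaving the right end free for the ascending elements still to come.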

Data Types in an SEA

A SEA of the present disclosure can be used to store any data types that can be sorted. For instance, the data elements in the SEA can be query keys, key and value pairs, or any other types of data. FIG. 1 illustrates a SEA that contains four data elements and five empty positions, with a total of nine positions. The stored elements, e1-e4, can be end data, such as numbers. In some aspects, the data elements stored in the positions are numbers, character strings, or block data, without limitation.

In another aspect, the elements can be keys (r1-r4 in FIG. 2) referencing other values (e1-e4). If references are stored, following the reference one can gain access to the actual data. A shifting operation may involve operations on the references only, not the actual data.

Look-up Hash Table (Index Array)

A look-up hash table may be maintained for fast key lookup in the array. The size of the hash table is at least the same as the size of the elastic array itself (FIG. 5). With the hash table, the elements in the array can be indexed, which can give rise to O(1) quick access to the elements in the array.

With reference to FIG. 5, an SEA (A) contains four elements, a1, a3, a4 and a7, where the numbers 1, 3, 4 and 7 reflect their positions in the array. An index array (I) is also created in the memory, having a length (i.e., number of slots) $N_{I}$ that is greater than the number of elements in array A. In one embodiment, the arrays A and I have the same length.

For every element in array A, array I contains at position p an integer i corresponding to the element. Position p is the output of a hash function taking the value of the element as input.

For example, if hash(a7)=0, then p is 0. Then, at position p, what is stored is the position (index) of a7, which is 7.

Therefore, when searching for a7, the computer can (1) first apply the hash function to a7, getting a result of 0, (2) look into position 0 of array I, getting a value 7, and (3) find a7 at position 7 of array A. As such, no binary search is required, and the cost of such a search is O(1), even much quicker than a conventional array.

To maximize the value of the index array, it is preferred that a hash function is chosen so that the hashed values of all elements in array A are integers between 0 and the length of I−1, and there is no overlap.

In the event multiple elements in array A are mapped to the same position in I, linear probing may be used to resolve the hash collision. When a new element is inserted into array A, a new entry is also inserted into the index array, at a position p with a value as explained above.
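The index array with linear probing can be sketched in Python as follows. The sketch stores only the integer position i in the index array, as described above, and confirms a hit by comparing against array A; the names `index_put` and `index_find`, the use of Python's built-in `hash` reduced modulo the index length, and the sample values are assumptions for illustration. Keeping the index consistent when elements are shifted or deleted is not covered.

```python
EMPTY = None  # assumed marker for an empty slot in either array

def index_put(index, cells, pos):
    """Record in the index array that the element cells[pos] is stored at pos."""
    n = len(index)
    p = hash(cells[pos]) % n
    for _ in range(n):
        if index[p] is EMPTY:
            index[p] = pos
            return p
        p = (p + 1) % n          # linear probing: try the next slot
    raise OverflowError("index array is full")

def index_find(index, cells, key):
    """Return the position of key in cells, or -1 if it is not indexed."""
    n = len(index)
    p = hash(key) % n
    for _ in range(n):
        i = index[p]
        if i is EMPTY:
            return -1            # reached an unused slot: key is absent
        if cells[i] == key:
            return i
        p = (p + 1) % n
    return -1

cells = [EMPTY, 10, EMPTY, 30, 40, EMPTY, EMPTY, 70]   # array A
index = [EMPTY] * len(cells)                           # array I, same length
for pos, value in enumerate(cells):
    if value is not EMPTY:
        index_put(index, cells, pos)
print(index_find(index, cells, 70))   # found without a binary search
print(index_find(index, cells, 50))   # not present
```

In this example the values 30 and 70 hash to the same slot, so 70 is stored one slot further along and is still found by probing.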

Such an index array with a corresponding hash function can greatly improve the performance of a SEA. For instance, suppose that in an elastic array of size N, an 8-byte reference (addressing type) is used and each reference refers to a 32-byte data element. Suppose one chooses 0.75 as the load factor threshold (f=0.75), and 2 as the augment factor (g=2). The storage size without the hash table is (0.75N×32+8N)=32N. The overhead of adding the hash table is 8N, giving 8/32=25% extra storage. Considering the O(1) cost benefit of key look-up, in some applications 25% extra memory is warranted.

Computer Systems and Network

The methodology described here can be implemented on a computer system or network. A suitable computer system can include at least a processor and memory; optionally, a computer-readable medium that stores computer code for execution by the processor. Once the code is executed, the computer system carries out the described methodology.

In this regard, a “processor” is an electronic circuit that can execute computer programs. Suitable processors are exemplified by but are not limited to central processing units, microprocessors, graphics processing units, physics processing units, digital signal processors, network processors, front end processors, coprocessors, data processors and audio processors. The term “memory” connotes an electrical device that stores data for retrieval. In one aspect, therefore, a suitable memory is a computer unit that preserves data and assists computation. More generally, suitable methods and devices for providing the requisite network data transmission are known.

Also contemplated is a non-transitory computer readable medium that includes executable code for carrying out the described methodology. In certain embodiments, the medium further contains data or databases needed for such methodology.

Embodiments can include program products comprising non-transitory machine-readable storage media for carrying or having machine-executable instructions or data structures stored thereon. Such machine-readable media may be any available media that may be accessed by a general purpose or special purpose computer or other machine with a processor. By way of example, such machine-readable storage media may comprise RAM, ROM, EPROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which may be used to store desired program code in the form of machine-executable instructions or data structures and which may be accessed by a general purpose or special purpose computer or other machine with a processor. Combinations of the above also come within the scope of “machine-readable media.” Machine-executable instructions comprise, for example, instructions and data that cause a general purpose computer, special-purpose computer or special-purpose processing machine(s) to perform a certain function or group of functions.

Embodiments of the present disclosure have been described in the general context of method steps which may be implemented in one embodiment by a program product including machine-executable instructions, such as program code, for example in the form of program modules executed by machines in networked environments. Generally, program modules include routines, programs, logics, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Machine-executable instructions, associated data structures, and program modules represent examples of program code for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps.

As previously indicated, embodiments of the present disclosure may be practiced in a networked environment using logical connections to one or more remote computers having processors. Those skilled in the art will appreciate that such network computing environments may encompass many types of computers, including personal computers, hand-held devices, multi-processor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and so on. Embodiments of the disclosure also may be practiced in distributed and cloud computing environments where tasks are performed by local and remote processing devices that are linked, by hardwired links, by wireless links or by a combination of hardwired or wireless links, through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

Although the discussions above may refer to a specific order and composition of method steps, it is understood that the order of these steps may differ from what is described. For example, two or more steps may be performed concurrently or with partial concurrence. Also, some method steps that are performed as discrete steps may be combined, steps being performed as a combined step may be separated into discrete steps, the sequence of certain processes may be reversed or otherwise varied, and the nature or number of discrete processes may be altered or varied. The order or sequence of any element or apparatus may be varied or substituted according to alternative embodiments. Accordingly, all such modifications are intended to be included within the scope of the present disclosure. Such variations will depend on the software and hardware systems chosen and on designer choice. It is understood that all such variations are within the scope of the disclosure. Likewise, software and web implementations of the present disclosure could be accomplished with standard programming techniques with rule based logic and other logic to accomplish the various database searching steps, correlation steps, comparison steps and decision steps.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.

The disclosures illustratively described herein may suitably be practiced in the absence of any element or elements, limitation or limitations, not specifically disclosed here. For example, the terms “comprising,” “including,” “containing,” etc. shall be read expansively and without limitation. Additionally, the terms and expressions employed here have been used as terms of description and not of limitation; hence, the use of such terms and expressions does not evidence an intention to exclude any equivalents of the features shown and described or of portions thereof. Rather, it is recognized that various modifications are possible within the scope of the disclosure claimed.

By the same token, while the present disclosure has been specifically disclosed by preferred embodiments and optional features, the knowledgeable reader will apprehend modification, improvement and variation of the subject matter embodied here. These modifications, improvements and variations are considered within the scope of the disclosure.

The disclosure has been described broadly and generically here. Each of the narrower species and subgeneric groupings falling within the generic disclosure also forms part of the disclosure. This includes the generic description of the disclosure with a proviso or negative limitation removing any subject matter from the genus, regardless of whether or not the excised material is described specifically.

Where features or aspects of the disclosure are described by reference to a Markush group, the disclosure also is described thereby in terms of any individual member or subgroup of members of the Markush group.

All publications, patent applications, patents, and other references mentioned herein are expressly incorporated by reference in their entirety, to the same extent as if each were incorporated by reference individually. In case of conflict, the present specification, including definitions, will control.

Although the disclosure has been described in conjunction with the above-mentioned embodiments, the foregoing description and examples are intended to illustrate and not limit the scope of the disclosure. Other aspects, advantages and modifications within the scope of the disclosure will be apparent to those skilled in the art to which the disclosure pertains.

1. A computer system for inserting in an input data element to an array,comprising a processor, memory and program instructions which, whenexecuted by the processor, configure the system to: (a) load an arrayinto memory, wherein the array has a size of B for storing, at amaximum, B data elements, wherein the array contains T data elements,and wherein the data elements are sorted in the array such that anyposition in the array, if not empty, contains a data element that isgreater than all data elements stored in positions on the left andsmaller than all data elements stored in positions on the right; (b)identify, in the array, a first data element as the greatest among alldata elements smaller than the input data element and a second dataelement as the smallest among all data elements greater than the inputdata element, wherein the first and second data elements are adjacent toeach other or separated by one or more empty positions therebetween; and(c) (i) if the first and second data elements are located in adjacentpositions, shift the first data elements and all adjacent data elementson the left to one position to the left and then placing the input dataelement in the position vacated by the first data element, or shift thesecond data elements and all adjacent data elements on the right to oneposition to the right and then placing the input data element in theposition vacated by the second data element, or (ii) place the inputdata element in a position between the first and second data elements.2. The computer system of claim 1, wherein B is at least 20% greaterthan T.
3. The computer system of claim 1, wherein the first and second data elements are separated by at least three empty positions and the input data element is placed in the middle position among the empty positions.
4. The computer system of claim 1, wherein the program instructions further configure the system to, when a ratio f=T/B is greater than a predetermined threshold, enlarge the size of the array to L by allocating memory for L-B additional positions to the array.
5. The computer system of claim 4, wherein the predetermined threshold is between 0.5 and 0.9.
6. The computer system of claim 4, wherein the program instructions further configure the system to shift one or more data elements in the array so that empty positions in the array are more evenly distributed.
7. The computer system of claim 1, wherein the program instructions further configure the system to repeat steps (b) and (c) for one or more new input data elements.
8. The computer system of claim 7, wherein, for different input data elements, both steps (i) and (ii) are carried out.
9. The computer system of claim 1, wherein the program instructions further configure the system to insert into a second index array, at position p, a value i, wherein p is a hash function output of the input data element and i is the position of the input data element in the array.
10. The computer system of claim 9, wherein the array and the second index array are of the same length.
11. The computer system of claim 9, wherein the hash function generates a unique non-negative integer value for each data element in the array.
12. A computer system for deleting a query data element from an array, comprising a processor, memory and program instructions which, when executed by the processor, configure the system to: (a) access an array that contains T data elements, wherein the array has a size of B for storing, at a maximum, B data elements, and wherein the data elements are sorted in the array such that any position in the array, if not empty, contains a data element that is greater than all data elements stored in positions on the left and smaller than all data elements stored in positions on the right; (b) identify a position in the array containing a data element equal to the query data element; and (c) mark the position as empty.
13. The computer system of claim 12, wherein the program instructions further configure the system to, when a ratio f=T/B is smaller than a predetermined threshold, shift data elements in the array so that one end of the array has one or more empty positions, and delete the empty positions from the array, thereby shrinking the size of the array.
14. The computer system of claim 13, wherein the predetermined threshold is between 0.1 and 0.7.
15. The computer system of claim 13, wherein the number of empty positions at the end is at least 10% of B.
16. A method of inserting an input data element into an array, comprising: (a) loading, by a computer processor, an array into a computer-readable memory, wherein the array has a size of B for storing, at a maximum, B data elements, wherein the array contains T data elements, and wherein the data elements are sorted in the array such that any position in the array, if not empty, contains a data element that is greater than all data elements stored in positions on the left and smaller than all data elements stored in positions on the right; (b) identifying, in the array, a first data element as the greatest among all data elements smaller than the input data element and a second data element as the smallest among all data elements greater than the input data element, wherein the first and second data elements are adjacent to each other or separated by one or more empty positions therebetween; and (c) (i) if the first and second data elements are located in adjacent positions, shifting the first data element and all adjacent data elements on the left one position to the left and then placing the input data element in the position vacated by the first data element, or shifting the second data element and all adjacent data elements on the right one position to the right and then placing the input data element in the position vacated by the second data element, or (ii) placing the input data element in a position between the first and second data elements.